Knowledge Retrieval and the World Wide Web

Authors

  • Philippe Martin
  • Peter W. Eklund
Abstract

Large-scale Web search engines effectively retrieve entire documents, but they are imprecise because they do not exploit, and hence cannot retrieve, the semantic content of Web documents. We cannot yet automatically extract such content from general documents. Manually structuring Web documents (for example, with XML) lets us retrieve more precise information using string- and structure-matching tools, such as the Web robots Harvest, WebSQL, and WebLog. However, this approach is not scalable, because it only retrieves fine-grained information if the documents are thinly structured and the querier knows their structures, exact names, and forms. Knowledge representation languages that support logic inference can help us achieve more flexible and precise knowledge representation and retrieval. Industry is currently developing many metadata languages to let people index Web information resources with knowledge representations (logical statements) and store them in Web documents. However, these metadata languages do not satisfy several requirements for precise, flexible, and scalable information retrieval. On the basis of ease of use and representational completeness, we argue in favor of general and intuitive knowledge representation languages such as conceptual graphs (CGs) [1] rather than the direct use of XML-based languages. To let users represent knowledge at the level of detail they require, we propose simple notations for restricted knowledge representation cases and a technique that lets users leave knowledge terms undeclared. We built a Web-accessible tool (a CGI server), WebKB [2,3], to support this approach and to let its users combine lexical, structural, and knowledge-based techniques to exploit or generate Web documents. WebKB is an ontology server and directed Web robot. (See the sidebar for a list of related URLs.)
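The contrast the abstract draws between lexical matching and inference-based retrieval can be illustrated with a small sketch. This is not WebKB's implementation or its notation: the type hierarchy, the `Statement` class, and the functions `is_a`, `keyword_search`, and `knowledge_search` are all hypothetical names chosen for illustration, and the statements are a much-simplified stand-in for conceptual graphs.

```python
from dataclasses import dataclass

# Hypothetical type hierarchy (child -> parent); not taken from WebKB.
SUPERTYPE = {
    "Cat": "Animal",
    "Dog": "Animal",
    "Animal": "Entity",
    "Mat": "Artifact",
    "Artifact": "Entity",
}


def is_a(concept_type: str, target: str) -> bool:
    """Return True if concept_type equals target or is a subtype of it."""
    current = concept_type
    while current is not None:
        if current == target:
            return True
        current = SUPERTYPE.get(current)
    return False


@dataclass(frozen=True)
class Statement:
    """A binary relation between two typed concepts, e.g. [Cat]->(on)->[Mat]."""
    source: str
    relation: str
    target: str


def keyword_search(documents, word):
    """Lexical retrieval: ids of documents whose text contains the word."""
    return [doc_id for doc_id, text in documents.items()
            if word.lower() in text.lower()]


def knowledge_search(facts, query):
    """Knowledge-based retrieval: statements that specialize the query."""
    return [
        (doc_id, stmt) for doc_id, stmt in facts
        if stmt.relation == query.relation
        and is_a(stmt.source, query.source)
        and is_a(stmt.target, query.target)
    ]


# Toy data (assumed): raw text plus one statement indexing doc1.
documents = {
    "doc1": "A cat is on a mat.",
    "doc2": "Animal shelters are listed here.",
}
facts = [("doc1", Statement("Cat", "on", "Mat"))]

# Query: "some animal is on some artifact".
query = Statement("Animal", "on", "Artifact")
keyword_hits = keyword_search(documents, "animal")
knowledge_hits = knowledge_search(facts, query)
```

In this toy setup the keyword search returns only the document that literally contains the word "animal", while the knowledge-based search returns the statement about a cat on a mat, because the hierarchy lets the query generalize over subtypes without the querier knowing the exact terms used in the document.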


Similar resources

Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. An unfocused approach often yields undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...


WebWatcher: Knowledge Navigation in the World Wide Web

Many have noted the need for software to assist people in locating information on the World Wide Web. Although effective tools exist, they typically rely on brute-force scanning and indexing of Web pages for later keyword-based retrieval. Such tools ignore at least two sources of knowledge which might prove useful in navigation and retrieval: (1) the structure of the Web as a graph, and (2) the...


The Semantic Web: A Vision or a Dream?

The Semantic Web strives to be a machine-readable version of the World Wide Web in which web sites have meaningful content. The Semantic Web uses improved information retrieval, metadata, annotation, and ontologies to enhance knowledge management. New web services will be possible, because data will be shared at a semantic level rather than a syntactic level. The Semantic Web will use XML, RDF...


Extracting Semantics with Lexical and Ontological Knowledge Sources: A Methodology for Context-Aware Information Retrieval from the World Wide Web

The continued growth of the World Wide Web has made the retrieval of relevant information for a user’s query increasingly difficult. A major obstacle to more accurate and semantically sound retrieval is the inability of web search systems to incorporate context in the retrieval process. This research presents a methodology to increase the semantic content of web query results by building contex...


Integration of Semantic Web and Knowledge Discovery for Enhanced Information Retrieval

Knowledge management is a process that comprises knowledge discovery, knowledge collection, knowledge organization, and knowledge processing. Among these four processes, knowledge discovery is integrated with the Semantic Web for enhanced information retrieval. Knowledge discovery is the process of automatically searching large volumes of data for patterns that can be considered knowledge about the data. ...


Post-web cognition: evolving knowledge strategies for global information environments

This paper considers the changing cognitive demands of the web as a representative of emerging pervasive and virtually instantaneous global information access environments. Because information retrieval time on the web approaches that of human memory, the appropriate knowledge strategies for seeking and remembering information begin to change. There is strong existing theoretical work on search ...



Journal title:
  • IEEE Intelligent Systems

Volume 15  Issue 

Pages  -

Publication date 2000